449 research outputs found

    Approximate Hamming distance in a stream

    We consider the problem of computing a $(1+\epsilon)$-approximation of the Hamming distance between a pattern of length $n$ and successive substrings of a stream. We first look at the one-way randomised communication complexity of this problem, giving Alice the first half of the stream and Bob the second half. We show the following: (1) If Alice and Bob both share the pattern then there is an $O(\epsilon^{-4} \log^2 n)$ bit randomised one-way communication protocol. (2) If only Alice has the pattern then there is an $O(\epsilon^{-2}\sqrt{n}\log n)$ bit randomised one-way communication protocol. We then go on to develop small space streaming algorithms for $(1+\epsilon)$-approximate Hamming distance which give worst case running time guarantees per arriving symbol. (1) For binary input alphabets there is an $O(\epsilon^{-3} \sqrt{n} \log^{2} n)$ space and $O(\epsilon^{-2} \log n)$ time streaming $(1+\epsilon)$-approximate Hamming distance algorithm. (2) For general input alphabets there is an $O(\epsilon^{-5} \sqrt{n} \log^{4} n)$ space and $O(\epsilon^{-4} \log^3 n)$ time streaming $(1+\epsilon)$-approximate Hamming distance algorithm. Comment: Submitted to ICALP' 201
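To make the problem statement concrete, here is a minimal Python sketch of the exact, naive baseline: it keeps the last $n$ symbols and recomputes the Hamming distance for every arriving symbol. It is not the small-space approximate algorithm from the abstract; the function name `streaming_hamming` and the sample strings are illustrative assumptions.

```python
from collections import deque
from typing import Iterable, Iterator


def streaming_hamming(pattern: str, stream: Iterable[str]) -> Iterator[int]:
    """Yield the exact Hamming distance between `pattern` and each
    length-n window of `stream`, starting once n symbols have arrived.

    Naive baseline (an assumption for illustration): O(n) space and
    O(n) time per arriving symbol.
    """
    n = len(pattern)
    window = deque(maxlen=n)  # retains only the n most recent symbols
    for symbol in stream:
        window.append(symbol)
        if len(window) == n:
            # Count positions where pattern and current window disagree.
            yield sum(p != s for p, s in zip(pattern, window))


# Hypothetical example input: pattern "abab" against an 8-symbol stream.
distances = list(streaming_hamming("abab", "abbbabab"))
print(distances)
```

The space and per-symbol time of this baseline are both linear in $n$, which is the cost the abstract's sublinear-space results aim to avoid.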

    Cell-Probe Bounds for Online Edit Distance and Other Pattern Matching Problems

    We give cell-probe bounds for the computation of edit distance, Hamming distance, convolution and longest common subsequence in a stream. In this model, a fixed string of $n$ symbols is given and one $\delta$-bit symbol arrives at a time in a stream. After each symbol arrives, the distance between the fixed string and a suffix of most recent symbols of the stream is reported. The cell-probe model is perhaps the strongest model of computation for showing data structure lower bounds, subsuming in particular the popular word-RAM model. * We first give an $\Omega((\delta \log n)/(w+\log\log n))$ lower bound for the time to give each output for both online Hamming distance and convolution, where $w$ is the word size. This bound relies on a new encoding scheme and for the first time holds even when $w$ is as small as a single bit. * We then consider the online edit distance and longest common subsequence problems in the bit-probe model ($w=1$) with a constant sized input alphabet. We give a lower bound of $\Omega(\sqrt{\log n}/(\log\log n)^{3/2})$ which applies for both problems. This second set of results relies both on our new encoding scheme as well as a carefully constructed hard distribution. * Finally, for the online edit distance problem we show that there is an $O((\log n)^2/w)$ upper bound in the cell-probe model. This bound gives a contrast to our new lower bound and also establishes an exponential gap between the known cell-probe and RAM model complexities. Comment: 32 pages, 4 figure

    Element Distinctness, Frequency Moments, and Sliding Windows

    We derive new time-space tradeoff lower bounds and algorithms for exactly computing statistics of input data, including frequency moments, element distinctness, and order statistics, that are simple to calculate for sorted data. We develop a randomized algorithm for the element distinctness problem whose time $T$ and space $S$ satisfy $T \in O(n^{3/2}/S^{1/2})$, smaller than previous lower bounds for comparison-based algorithms, showing that element distinctness is strictly easier than sorting for randomized branching programs. This algorithm is based on a new time and space efficient algorithm for finding all collisions of a function $f$ from a finite set to itself that are reachable by iterating $f$ from a given set of starting points. We further show that our element distinctness algorithm can be extended at only a polylogarithmic factor cost to solve the element distinctness problem over sliding windows, where the task is to take an input of length $2n-1$ and produce an output for each window of length $n$, giving $n$ outputs in total. In contrast, we show a time-space tradeoff lower bound of $T \in \Omega(n^2/S)$ for randomized branching programs to compute the number of distinct elements over sliding windows. The same lower bound holds for computing the low-order bit of $F_0$ and computing any frequency moment $F_k$, $k \neq 1$. This shows that those frequency moments and the decision problem $F_0 \bmod 2$ are strictly harder than element distinctness. We complement this lower bound with a $T \in O(n^2/S)$ comparison-based deterministic RAM algorithm for exactly computing $F_k$ over sliding windows, nearly matching both our lower bound for the sliding-window version and the comparison-based lower bounds for the single-window version. We further exhibit a quantum algorithm for $F_0$ over sliding windows with $T \in O(n^{3/2}/S^{1/2})$. Finally, we consider the computations of order statistics over sliding windows. Comment: arXiv admin note: substantial text overlap with arXiv:1212.437
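The sliding-window task described above (an input of length $2n-1$, one output per window of length $n$) can be illustrated with a short Python sketch that computes $F_0$, the number of distinct elements, for every window. This is a hash-based baseline using space linear in $n$, not the time-space-bounded or comparison-based algorithms of the abstract; the function name `sliding_distinct_counts` and the sample data are assumptions for illustration.

```python
from collections import Counter
from typing import List, Sequence


def sliding_distinct_counts(data: Sequence[int], n: int) -> List[int]:
    """Return F_0 (number of distinct elements) for each length-n window.

    Illustrative baseline only: keeps a multiset of the current window,
    so it uses O(n) space rather than a restricted space budget S.
    """
    counts: Counter = Counter()
    out: List[int] = []
    for i, x in enumerate(data):
        counts[x] += 1
        if i >= n:
            # Drop the element that just slid out of the window.
            old = data[i - n]
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
        if i >= n - 1:
            out.append(len(counts))
    return out


n = 3
data = [1, 2, 1, 3, 3]  # hypothetical input of length 2n - 1
f0 = sliding_distinct_counts(data, n)
# A window satisfies element distinctness exactly when its F_0 equals n.
distinct_flags = [c == n for c in f0]
print(f0, distinct_flags)
```

Element distinctness for a window follows directly from $F_0$, which is consistent with the abstract's point that $F_0$ is at least as hard as element distinctness in this setting.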

    New Unconditional Hardness Results for Dynamic and Online Problems

    There has been a resurgence of interest in lower bounds whose truth rests on the conjectured hardness of well known computational problems. These conditional lower bounds have become important and popular due to the painfully slow progress on proving strong unconditional lower bounds. Nevertheless, the long term goal is to replace these conditional bounds with unconditional ones. In this paper we make progress in this direction by studying the cell probe complexity of two particularly important problems that are conjectured to be hard: matrix-vector multiplication and a version of dynamic set disjointness known as Patrascu's Multiphase Problem. We give improved unconditional lower bounds for these problems and introduce new proof techniques of independent interest. These include a technique capable of proving strong threshold lower bounds of the following form: If we insist on having a very fast query time, then the update time has to be slow enough to compute a lookup table with the answer to every possible query. This is the first time a lower bound of this type has been proven.

    Upper and lower bounds for dynamic data structures on strings

    We consider a range of simply stated dynamic data structure problems on strings. An update changes one symbol in the input and a query asks us to compute some function of the pattern of length $m$ and a substring of a longer text. We give both conditional and unconditional lower bounds for variants of exact matching with wildcards, inner product, and Hamming distance computation via a sequence of reductions. As an example, we show that there does not exist an $O(m^{1/2-\varepsilon})$ time algorithm for a large range of these problems unless the online Boolean matrix-vector multiplication conjecture is false. We also provide nearly matching upper bounds for most of the problems we consider. Comment: Accepted at STACS'1
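As a rough illustration of the update/query interface described above, the following Python sketch implements the trivial solution for the Hamming distance variant: constant-time updates and $O(m)$-time queries. It is not one of the paper's upper bounds; the class name `NaiveDynamicHamming`, its method names, and the choice to allow updates only to the text are assumptions made for this example.

```python
class NaiveDynamicHamming:
    """Trivial dynamic structure: O(1) update, O(m) query.

    Hypothetical interface for illustration; updates are restricted
    to the text, which is an assumption of this sketch.
    """

    def __init__(self, pattern: str, text: str) -> None:
        self.pattern = list(pattern)
        self.text = list(text)

    def update_text(self, i: int, symbol: str) -> None:
        # An update changes a single symbol of the text.
        self.text[i] = symbol

    def query(self, start: int) -> int:
        # Hamming distance between the pattern and text[start:start+m].
        m = len(self.pattern)
        if start < 0 or start + m > len(self.text):
            raise IndexError("window does not fit inside the text")
        window = self.text[start:start + m]
        return sum(p != t for p, t in zip(self.pattern, window))


ds = NaiveDynamicHamming("abc", "abcabd")
before = ds.query(3)      # compares "abc" with "abd"
ds.update_text(5, "c")    # text becomes "abcabc"
after = ds.query(3)       # compares "abc" with "abc"
print(before, after)
```

The conditional lower bound quoted in the abstract concerns how far below this kind of linear query time one can go while keeping updates fast.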

    Silicon Burning II: Quasi-Equilibrium and Explosive Burning

    Having examined the application of quasi-equilibrium to hydrostatic silicon burning in Paper I of this series, Hix & Thielemann (1996), we now turn our attention to explosive silicon burning. Previous authors have shown that for material which is heated to high temperature by a passing shock and then cooled by adiabatic expansion, the results can be divided into three broad categories: \emph{incomplete burning}, \emph{normal freezeout} and \emph{$\alpha$-rich freezeout}, with the outcome depending on the temperature, density and cooling timescale. In all three cases, we find that the important abundances obey quasi-equilibrium for temperatures greater than approximately 3 GK, with relatively little nucleosynthesis occurring following the breakdown of quasi-equilibrium. We will show that quasi-equilibrium provides better abundance estimates than global nuclear statistical equilibrium, even for normal freezeout and particularly for $\alpha$-rich freezeout. We will also examine the accuracy with which the final nuclear abundances can be estimated from quasi-equilibrium. Comment: 27 pages, including 15 inline figures. LaTeX 2e with aaspp4 and graphicx packages. Accepted to Ap